Performance Evaluation of Parallel Sparse Matrix-Vector Products on SGI Altix3700
Abstract
The present paper discusses scalable implementations of sparse matrix-vector products, which are crucial for high-performance solution of large-scale linear equations, on the SGI Altix3700, a cc-NUMA machine. Three storage formats for sparse matrices are evaluated, and scalability is attained by implementations that take the page allocation mechanism of the NUMA machine into account. The influence of the cache and memory bus architectures on the optimum choice of storage format is examined, and scalable converters between storage formats are shown to facilitate exploitation of the higher-performance formats.
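As a rough illustration of what a page-allocation-aware sparse matrix-vector product might look like, the following C sketch uses compressed row storage (CRS) with OpenMP. It rests on several assumptions that are not taken from the abstract: the abstract does not name its three storage formats, so CRS is chosen here only as a familiar example; the operating system is assumed to place each page near the thread that first writes it (a first-touch policy); and the names `crs_matrix`, `alloc_first_touch`, and `crs_spmv` are hypothetical. It is not the authors' implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* CRS (compressed row storage) matrix; layout and names are hypothetical. */
typedef struct {
    int n;           /* number of rows */
    int *row_ptr;    /* n + 1 entries; row i occupies [row_ptr[i], row_ptr[i + 1]) */
    int *col_idx;    /* column index of each stored nonzero */
    double *val;     /* value of each stored nonzero */
} crs_matrix;

/* Allocate a vector and zero it with the same static row partition that
 * crs_spmv uses. Assuming a first-touch policy, each page should then be
 * placed near the thread that later reads or writes it. */
double *alloc_first_touch(int n)
{
    double *v = malloc((size_t)n * sizeof *v);
    if (v == NULL)
        return NULL;
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++)
        v[i] = 0.0;
    return v;
}

/* Compute y = A * x, one row per loop iteration. */
void crs_spmv(const crs_matrix *a, const double *x, double *y)
{
    assert(a->row_ptr[0] == 0);
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < a->n; i++) {
        double sum = 0.0;
        for (int k = a->row_ptr[i]; k < a->row_ptr[i + 1]; k++)
            sum += a->val[k] * x[a->col_idx[k]];
        y[i] = sum;
    }
}
```

The static schedule is used in both loops so that the thread that initializes rows `i` of a vector is also the one that computes them. The matrix arrays would need the same treatment in a real code, and compiling without OpenMP support simply yields the serial version.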
Related Resources
A generic interface for parallel cell-based finite element operator application
We present a memory-efficient and parallel framework for finite element operator application implemented in the generic open-source library deal.II. Instead of assembling a sparse matrix and using it for matrix-vector products, the operation is applied by cell-wise quadrature. The evaluation of shape functions is implemented with a sum-factorization approach. Our implementation is parallelized ...
A Shared Memory Parallel Implementation of Block-Circulant Preconditioners
The parallel numerical solution of large scale elliptic boundary value problems is discussed. We analyze the parallel complexity of two block-circulant preconditioners when the conjugate gradient method is used to solve the sparse linear systems arising from such problems. A simple general model of the parallel performance is applied to the considered shared memory parallel architecture. Estima...
Benchmarking Performance of Parallel Computers Using a 2d Elliptic Solver
It was recently shown that block-circulant preconditioners applied to a conjugate gradient method used to solve structured sparse linear systems arising from 2D elliptic problems have very good numerical properties and a potential for good parallel efficiency. The aim of the presentation is to summarize and compare their parallel performance across a number of modern parallel computers: SGI Pow...
Performance Characterization of Matrix Multiplication on SGI Altix 3700
Matrix multiplication is widely used in a variety of applications and is often one of the core components of many scientific computations, including graph theory, numerical methods, digital control and signal processing. Multiplication of large matrices requires a lot of computation time, as its complexity is O(n^3), where n is the dimension of the matrix. A serial algorithm to compute large ma...
Blocked-based sparse matrix-vector multiplication on distributed memory parallel computers
The present paper discusses implementations of sparse matrix-vector products, which are crucial for high-performance solution of large-scale linear equations, on a PC cluster. Three storage formats for sparse matrices (compressed row storage, block compressed row storage, and sparse block compressed row storage) are evaluated. Although using the BCRS format reduces the execution time, the impr...
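To make the block compressed row storage (BCRS) idea concrete, here is a minimal serial C sketch of a BCRS matrix-vector product. The fixed 2x2 block size, the row-major layout inside each block, and the names `bcrs_matrix` and `bcrs_spmv` are assumptions made for illustration; the distribution across cluster nodes described in that abstract is not modeled.

```c
#include <assert.h>
#include <stddef.h>

enum { BS = 2 };  /* block size; 2x2 is assumed here for illustration */

/* BCRS matrix made of dense BS x BS blocks; names are hypothetical. */
typedef struct {
    int nb;          /* number of block rows */
    int *brow_ptr;   /* nb + 1 entries; block row ib occupies [brow_ptr[ib], brow_ptr[ib + 1]) */
    int *bcol_idx;   /* block-column index of each stored block */
    double *val;     /* BS * BS values per stored block, row-major */
} bcrs_matrix;

/* Compute y = A * x. One column index is read per block instead of one per
 * nonzero, which is the usual motivation for blocked formats. */
void bcrs_spmv(const bcrs_matrix *a, const double *x, double *y)
{
    assert(a->brow_ptr[0] == 0);
    for (int ib = 0; ib < a->nb; ib++) {
        double acc[BS] = {0.0};
        for (int k = a->brow_ptr[ib]; k < a->brow_ptr[ib + 1]; k++) {
            const double *blk = a->val + (size_t)k * BS * BS;
            const double *xb = x + (size_t)a->bcol_idx[k] * BS;
            for (int r = 0; r < BS; r++)
                for (int c = 0; c < BS; c++)
                    acc[r] += blk[r * BS + c] * xb[c];
        }
        for (int r = 0; r < BS; r++)
            y[ib * BS + r] = acc[r];
    }
}
```

Blocks that are only partly filled still store explicit zeros in this layout, so any gain from fewer index loads has to be weighed against the extra arithmetic and memory traffic.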
Publication date: 2005